In this application, medical images of different modalities are used for image analysis and fusion. MRI (Magnetic Resonance Imaging) and CT (Computed Tomography) images are taken as input, and a fused image is produced as output; this fused image carries more information than either input image alone. The proposed method uses VMD (Variational Mode Decomposition) to decompose the inputs into IMFs (Intrinsic Mode Functions), which are then combined for fusion. The fused image is compared with the individual MRI and CT images using metrics such as entropy, mutual information, edge intensity, PSNR, SSIM, and MSE. The proposed method then uses a CNN (Convolutional Neural Network) to analyze the fused image and predict whether the scan is normal or contains a tumor. The proposed CNN model outperforms state-of-the-art machine learning models.
Introduction
Medical imaging is crucial for early disease detection, but relying on a single imaging modality (like X-ray, MRI, or CT scan) can lead to inaccurate or delayed diagnosis. To overcome this, the proposed system introduces a multimodal AI-based medical imaging platform that combines different imaging types—and potentially audio signals—to improve diagnostic accuracy and speed.
The system is a secure, web-based platform that allows medical professionals to upload and analyze multiple medical images in one place. It uses deep learning models (such as CNNs and Vision Transformers) to extract features from images, perform preprocessing (normalization, noise reduction), and apply multimodal fusion to combine information from different imaging sources.
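As a concrete illustration of the preprocessing stage mentioned above, the normalization and noise-reduction steps can be sketched in plain NumPy. This is a minimal sketch, not the platform's implementation: `preprocess` is a hypothetical helper name, grayscale images are assumed to arrive as NumPy arrays, and a simple 3x3 mean filter stands in for whatever denoising the system actually applies.

```python
import numpy as np

def preprocess(image: np.ndarray) -> np.ndarray:
    """Min-max normalize intensities to [0, 1], then apply a 3x3 mean
    filter as a simple noise-reduction step."""
    img = image.astype(np.float64)
    # Min-max normalization; guard against a constant image (zero range).
    rng = img.max() - img.min()
    img = (img - img.min()) / rng if rng > 0 else np.zeros_like(img)
    # Reflect-pad so the 3x3 filter is defined at the borders.
    padded = np.pad(img, 1, mode="reflect")
    # Box filter: average each pixel with its eight neighbours.
    out = np.zeros_like(img)
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / 9.0
```

In practice the denoising filter would be chosen per modality (e.g. edge-preserving filters for CT), but the normalize-then-smooth ordering shown here is the common pattern.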
The fused data is then used for abnormality detection and classification, providing prediction labels along with confidence scores. To improve trust and transparency, the system generates heatmaps (using techniques like Grad-CAM) that highlight affected regions in the images.
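The core of the Grad-CAM heatmap generation described above can be sketched as follows. This is a minimal NumPy illustration of the weighting step only, assuming the convolutional-layer activations and the gradients of the target class score with respect to them have already been extracted from the network; the function name and shapes are illustrative, not the platform's API.

```python
import numpy as np

def grad_cam_map(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Compute a Grad-CAM heatmap from one conv layer's activations and the
    gradients of the class score w.r.t. those activations.
    Both inputs have shape (channels, H, W); the result has shape (H, W)."""
    # Channel importance weights: global-average-pool the gradients.
    weights = gradients.mean(axis=(1, 2))             # (channels,)
    # Weighted combination of the activation maps.
    cam = np.tensordot(weights, activations, axes=1)  # (H, W)
    # ReLU keeps only features with a positive influence on the class.
    cam = np.maximum(cam, 0.0)
    # Normalize to [0, 1] so the map can be overlaid on the input image.
    if cam.max() > 0:
        cam = cam / cam.max()
    return cam
```

The resulting map is typically upsampled to the input resolution and blended with the original scan to highlight the regions driving the prediction.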
The platform emphasizes usability, secure data handling, and accessibility, making it suitable for hospitals and remote healthcare settings. Performance is evaluated using standard metrics like accuracy, precision, recall, and F1-score.
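The evaluation metrics named above follow the standard confusion-matrix definitions, which can be sketched for the binary normal-vs-tumor case; the helper name and label convention (1 = tumor, 0 = normal) are assumptions for illustration.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1-score for a binary task
    (1 = tumor, 0 = normal)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

Recall is especially important in this setting, since a false negative means a missed tumor.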
Overall, the system enhances diagnostic accuracy, reduces human error, speeds up decision-making, and demonstrates the potential of AI-driven multimodal analysis in modern healthcare.
Conclusion
The proposed method was successfully implemented in Python using a relevant dataset. The dataset contains MRI and CT scans of the same patient and the same anatomical region. VMD decomposition and the resulting IMFs are used to fuse the input MRI and CT scans. The performance of the proposed fusion is evaluated using metrics such as PSNR, SSIM, entropy, and MSE. The fused image is then passed to a CNN for analysis. The CNN is first trained on normal images and tumor images; a new fused test image is then fed to the trained CNN, which predicts whether the image is normal or contains a tumor and, if a tumor is present, reports its confidence as a percentage.
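Some of the fusion-quality metrics used above can be sketched in plain NumPy. These are the textbook definitions of MSE, PSNR, and Shannon entropy for 8-bit images, shown for illustration; they are not the exact implementation used in the experiments.

```python
import numpy as np

def mse(ref: np.ndarray, img: np.ndarray) -> float:
    """Mean squared error between a reference image and a test image."""
    return float(np.mean((ref.astype(np.float64) - img.astype(np.float64)) ** 2))

def psnr(ref: np.ndarray, img: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB (infinite for identical images)."""
    e = mse(ref, img)
    return float("inf") if e == 0 else 10.0 * np.log10(peak ** 2 / e)

def entropy(img: np.ndarray) -> float:
    """Shannon entropy of the 8-bit intensity histogram, in bits per pixel.
    Higher entropy indicates more information content in the fused image."""
    hist, _ = np.histogram(img, bins=256, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]  # 0 * log(0) is taken as 0
    return float(-np.sum(p * np.log2(p)))
```

A fused image is generally considered better when its entropy exceeds that of either source image while its PSNR/SSIM against the sources remains high.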
In the future, this application can be integrated with an Android application to make it available to everyone.